Robust Sequential Prediction in Linear Regression with Student's t-distribution
Authors

Abstract
The Predictive Least Squares (PLS) model selection criterion is known to be consistent in the context of linear regression. For small sample sizes, however, it can exhibit erratic behavior. We show that this shortcoming can be amended by incorporating a Student's t-distribution into PLS. The resulting criterion is shown to be asymptotically equivalent to PLS but significantly more robust for small sample sizes. A scale parameter involved with the t-distribution can be used to incorporate an estimate of the scale of the noise; it is shown that the new criterion is robust with regard to the choice of this parameter and that its effect disappears asymptotically. The recently proposed Sequentially Normalized Least Squares (SNLS) criterion can be written in a form that exposes a similar interpretation, with the exception that the scale parameter of the t-distribution is estimated sequentially from the data. Numerical experiments are presented; they indicate that using a Student's t-distribution enhances model selection performance and that the benefit of the scale estimator of SNLS is negligible.

Introduction

Linear regression has recently received attention in the sequential or online setting, where work has been done in selecting a subset of the covariates (Määttä, Schmidt, and Roos 2015) and finding a predictor that minimizes the worst-case regret (Bartlett et al. 2015). The probabilistic case that we consider also fits the prequential framework of Dawid (1984).

In this article, we concentrate on the subset selection problem, also called the model selection problem. We assume a fixed design matrix Z_n ∈ R^{n×q} that consists of the row vectors z_1, z_2, ..., z_n. Associated with each sample z_t, we have a response y_t ∈ R. The goal is to select a non-empty subset of the covariates, γ ⊆ {1, 2, ..., q}, that strikes a good balance between underfitting (poor prediction of the training data) and overfitting (poor generalization to future data).
In order to assess the performance of subset selection methods, one often introduces the assumption that the data (y_{1:n}, Z_n) come from the linear model

    y_t = z_t β + ε_t,    (1)

where β ∈ R^q is a fixed coefficient vector and the ε_t are i.i.d. noise with E[ε_t] = 0 and E[ε_t^2] = σ^2 < ∞. In this setting, a subset selection method is said to be consistent if its probability of selecting the γ that corresponds exactly to the non-zero elements of β approaches one as the sample size n tends to infinity. This γ is referred to as the true model or true subset.

For the batch case, where the score of a subset γ cannot be represented in a sequential manner, there are numerous methods (McQuarrie and Tsai 1998). Perhaps the most well-known of these is the Bayesian Information Criterion (Akaike 1978; Schwarz 1978), or BIC, which is also known as the Schwarz Information Criterion (SIC). For the model (1), the BIC criterion is

    BIC(y_{1:n}, Z_n, γ) := n log(σ̂^2_{n,γ}) + |γ| log n,    (2)

where |γ| is the cardinality of γ and

    σ̂^2_{n,γ} := (1/n) Σ_{t=1}^{n} (y_t − z_t β̂_n)^2.    (3)

Here and later, β̂_n ∈ R^q denotes the maximum likelihood estimate of β computed using the first n samples and with the restriction that the entries of β̂_n that are not present in γ are forced to be zeros.

As for information criteria based on sequential prediction, we are aware of only two (besides Bayesian methods that admit a sequential interpretation). The first is the Predictive Least Squares (PLS) criterion, introduced by Rissanen (1986), which is defined as

    PLS(y_{1:n}, Z_n, γ) := Σ_{t=1}^{n} (y_t − z_t β̂_{t−1})^2,

the accumulated squared one-step-ahead prediction errors.